A Stream-based Audio Segmentation, Classification and Clustering Pre-processing System for Broadcast News using ANN Models
Abstract
This paper describes the development of a low-latency, stream-based audio pre-processing system for broadcast news that uses model-based techniques. The system performs speech/non-speech classification, speaker segmentation, speaker clustering, and gender and background-condition classification. To increase modelling accuracy, our algorithms make extensive use of Artificial Neural Networks (ANNs), thus avoiding the rough assumptions normally made about the distribution of the audio signal. Experiments were conducted on the COST278 multilingual TV broadcast news database, and the results were compared with current state-of-the-art algorithms using standard evaluation tools. Additionally, we investigated the impact of the automatic audio pre-processing system on speech recognition using a large broadcast news test database for European Portuguese. These tests show a small degradation in recognition performance compared with hand-labelled audio segmentation. Our system is part of a prototype closed-captioning system that processes the main news show of two Portuguese broadcasters daily.
Similar resources
An improved preprocessor for the automatic transcription of broadcast news audio stream
This paper deals with the preprocessing of the broadcast news (BN) audio stream for automatic transcription. The preprocessing consists of automatic segmentation followed by broad-class segment identification. The former detects speaker and/or acoustic changes in the BN audio stream with a precision of 82.75%. The latter acts as a filter that removes no...
Segmentation and classification of broadcast news audio
Broadcast news audio data contains a wide variety of different speakers and audio conditions (channel and background noise). This paper describes a segmentation, gender detection and audio classification scheme for such data which aims to provide a speech recogniser with a stream of reasonably-sized segments, each from a single speaker and audio type while discarding non-speech data. Each segmen...
Assessing Prosodic And Text Features For Segmentation Of Mandarin Broadcast News
Automatic topic segmentation, separation of a discourse stream into its constituent stories or topics, is a necessary preprocessing step for applications such as information retrieval, anaphora resolution, and summarization. While significant progress has been made in this area for text sources and for English audio sources, little work has been done in automatic segmentation of other languages...
Story Segmentation and Detection of Commercials in Broadcast News Video
The Informedia Digital Library Project [Wactlar96] allows full content indexing and retrieval of text, audio and video material. Segmentation is an integral process in the Informedia digital video library. The success of the Informedia project hinges on two critical assumptions: that we can extract sufficiently accurate speech recognition transcripts from the broadcast audio and that we can seg...
Neural Network-Based Learning Kernel for Automatic Segmentation of Multiple Sclerosis Lesions on Magnetic Resonance Images
Background: Multiple Sclerosis (MS) is a degenerative disease of the central nervous system. MS patients have dead tissue in their brains, called MS lesions. MRI is an imaging technique sensitive to soft tissues such as the brain, and it shows MS lesions as hyper-intense or hypo-intense signals. Since manual segmentation of these lesions is a laborious and time-consuming task, automatic segmentation ...
Publication date: 2005